Chapter 5.1. Introduction and Tools

5.1.1. - Introduction
The goal of my DMS prototype is to create a system in which the music is highly adaptive to player input while maintaining a smooth, narrative-driven structure. I will be using Unity as the game engine, Wwise as the audio engine, a custom Unity tool for programming logic processes, and Reaper64 as a digital audio workstation. The custom Unity tool is called the Gatekeeper. The Gatekeeper is a proprietary audio tool built outside of the thesis for use in professional game development, and is owned by Phi Dinh and me. The Gatekeeper tool itself is not a contribution of the thesis; however, the function it provides is part of what makes my DMS novel. The Gatekeeper is discussed later in this chapter.

The prototype is a game that I designed to support a DMS that investigates the conflict between interactivity and musical structure. Phi Dinh helped with the programming and art for the game, but did not contribute to the DMS design, composition, or audio implementation. Code is hooked to various in-game objects and parameters. These code hooks are passed through sequences of logic processes to generate ‘events’ that trigger changes in the music. ‘Events’, as they are called in Wwise, occur when the game engine notifies Wwise that a change has happened in the game that warrants an audio response. Events can control volume, filters, effects, audio cues, and game state changes. Wwise is free to download, and many tutorials and courses are available online for readers inclined to understand its intricacies. The Wwise session for my prototype is included on the USB. I used Reaper64 as a digital audio workstation to create the music and audio assets for the prototype. The music was created using virtual instruments and sample libraries. It was organized into segments for the DMS in Reaper; the segments were then rendered and imported into Wwise. I have included a folder with all of the individual music assets on the USB.
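For readers unfamiliar with how these notifications look in practice, below is a minimal sketch of a Unity script posting a Wwise event via the Wwise Unity integration’s AkSoundEngine.PostEvent call. The script and event name (“ENTER_COMBAT”) are hypothetical illustrations, not assets from the prototype.

// A minimal sketch (not code from the prototype) of a game object
// notifying Wwise of a state change via the Wwise Unity integration.
// The event name "ENTER_COMBAT" is a hypothetical example.
using UnityEngine;

public class CombatTrigger : MonoBehaviour
{
    void OnTriggerEnter(Collider other)
    {
        if (other.CompareTag("Player"))
        {
            // Post the event; Wwise applies whatever actions (play, stop,
            // set state, adjust volume, etc.) are bound to it.
            AkSoundEngine.PostEvent("ENTER_COMBAT", gameObject);
        }
    }
}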

5.1.2. - Reflection on Case Studies and Goals
Case Study 1 shows how stingers and loops can be used to adapt immediately to player input. This immediate responsiveness often results in loops interrupting each other mid-phrase, and the sequence of musical segments over time does not cohere into a narrative structure. This is an example of sacrificing musical structure for interactivity.

My prototype attempts to fix this problem by treating stingers differently. Stingers within my DMS prototype are used to develop existing musical material in a new direction. Each stinger that occurs is followed by the development of previous material, indicating that the player’s actions are developing the music.

Case Study 2 shows how low levels of adaptability allow relatively linear compositions to be used. The complaint about the Abyss Watchers’ DMS is that it sacrifices interactivity for a clear, linear musical structure. Melodic phrases, harmonies, and rhythms contribute to the music’s development because the music is not required to adapt to in-game parameters (except when the second phase of the boss fight begins).

My prototype treats this as evidence of the benefit of longer musical segments in a DMS (less granularity). However, there must be a creative judgement weighing the need for immediate musical change against the space to allow a musical phrase to finish. Where it is necessary to interrupt a phrase, my prototype attempts to provide a horizontal resequencing solution that allows current phrases to finish while also introducing new musical material.

Through the process of creating my prototype I had to come to terms with the fact that it would simply be too much work to write enough music to transition instantly and seamlessly between every segment at any time. However, it is also unacceptable to have music that is always slow to adapt, as was seen in the Skyrim example in Chapter 3. This is why, even for a DMS as complex as mine, with hundreds of unique music segments, it is important to find a balance between granularity and practicality. After all, the purpose is not to create the most complex system possible; it is to create the illusion that the DMS’s musical structure unfolds as a linear narrative congruent with the player’s narrative progression. Immediate responsiveness is not always required, but the perception of some musical change connected to the player’s sequence of actions is.

Case Study 3 shows a DMS that primarily functions by randomizing small segments to create variable looping sequences. Different areas of the level are accompanied by different musical ideas. Motifs are established and tied to different characters and game states. These motifs are referenced a few times throughout the level; however, they are mostly developed during the cut-scenes, where the music is linear. The Pillar of Autumn DMS sits somewhere between the other two. The music adapts in a granular way, uses randomized horizontal resequencing to create a feeling of non-repetition, and often concludes musical phrases before cutting them off or transitioning to something else. In general, though, the random sequence containers only prevent the music from being exactly repetitive; the technique does not provide a means for the music to continuously develop alongside the player’s narrative.

5.1.3. - Gatekeeper Tool
The Gatekeeper is a Unity tool that was developed to program logic processes for audio events. Understanding how the Gatekeeper works is a major part of understanding what enables my DMS to function differently from others. The Gatekeeper is a simple programming tool that allows various logic statements to be set up easily. Setting up logic processes with the Gatekeeper requires manual work; it does not automate anything. Its function could easily be duplicated by a programmer who wanted to achieve the same thing, and similar tools likely already exist for a multitude of different applications. Below is an algorithmic flowchart that describes how the tool works.



The Gatekeeper tool has two components: the first is called the “Gatekeeper” and the second is called the “Music Monitor”. As previously stated, the Gatekeeper component is the visual programming tool that allows me to input various conditions to create logic processes, which generate events based on what the player is currently doing and what music has previously been played. For this to work the Gatekeeper needs to know 1) what the player is doing, and 2) what music has previously been played.

The first piece of information comes from code attached to different game objects and parameters in the game. For example, if the player enters combat, the code triggers an event to relay what has just happened. The game engine knows this event was sent, so it knows what the player is doing. Typically this event would be sent directly to the audio engine. In our case, however, it is sent to the Gatekeeper instead. The Gatekeeper runs a predetermined string of logic processes before allowing an event to be sent to the audio engine. For example:

Process 1: the player just entered combat; if “music A” most recently played, then trigger “combat music A”; else, run Process 2.
Process 2: the player just entered combat; if “music B” most recently played, then trigger “combat music B”.

The combat music will differ depending on which process passes. When a process passes, the Gatekeeper generates and sends a new event to the audio engine to trigger the correct musical change. This is where the Gatekeeper gets its name: its function is to prevent the audio engine from receiving events unless they pass a series of tests.
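Since the Gatekeeper itself is proprietary and configured visually, the following is only a hypothetical sketch of the kind of logic it encodes: intercepting a game event, consulting a log of recently played music, and forwarding a different Wwise event depending on which process passes. All names are illustrative, and MusicMonitor.MostRecent() stands in for the tool’s internal log of played segments, which is sketched below.

// Hypothetical sketch of a Gatekeeper logic chain; all names are
// illustrative. MusicMonitor.MostRecent() stands in for the tool's
// internal log of played segments (sketched further below).
using UnityEngine;

public static class Gatekeeper
{
    // Called by gameplay code instead of posting directly to Wwise.
    public static void OnEnteredCombat(GameObject source)
    {
        // Process 1: "music A" most recently played.
        if (MusicMonitor.MostRecent() == "MUSIC_A")
        {
            AkSoundEngine.PostEvent("COMBAT_MUSIC_A", source);
        }
        // Process 2: "music B" most recently played.
        else if (MusicMonitor.MostRecent() == "MUSIC_B")
        {
            AkSoundEngine.PostEvent("COMBAT_MUSIC_B", source);
        }
        // If no process passes, no event reaches the audio engine --
        // hence the name "Gatekeeper".
    }
}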

The second piece of information the Gatekeeper needs is which music segments have previously played, and when. This is not a typical feature of contemporary game audio engines like Wwise. However, Wwise does have a feature that allows Unity to register which segments of music are being triggered. This is done by placing a marker called a “custom user cue” on each segment in Wwise. When the segment plays, Wwise passes the marker’s name back to Unity. This feature is usually used in game development to sync animations or visual effects with different parts of the music. I am using it to log all of the triggered music segments, their sequence, and their timestamps. The part of the Gatekeeper that logs the triggered segments is called the Music Monitor. The Music Monitor provides the Gatekeeper with the information it needs to run logic processes that check for previously played sequences of music.
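As a rough illustration, the sketch below shows how such a log might be built with the Wwise Unity integration, by posting music events with the AK_MusicSyncUserCue callback flag and recording each reported cue name with a timestamp. The class and member names are my own hypothetical stand-ins, and the exact callback field names can vary between Wwise versions.

// A minimal sketch of a Music Monitor, assuming the Wwise Unity
// integration. Exact callback field names can vary between versions.
using System.Collections.Generic;
using UnityEngine;

public static class MusicMonitor
{
    // Log entry: segment (cue) name plus the game time it was heard.
    public struct Entry { public string Cue; public float Time; }

    static readonly List<Entry> log = new List<Entry>();

    public static void Play(string musicEvent, GameObject source)
    {
        // Subscribe to user-cue notifications for this music event.
        AkSoundEngine.PostEvent(musicEvent, source,
            (uint)AkCallbackType.AK_MusicSyncUserCue, OnMusicCallback, null);
    }

    static void OnMusicCallback(object cookie, AkCallbackType type, AkCallbackInfo info)
    {
        if (type == AkCallbackType.AK_MusicSyncUserCue &&
            info is AkMusicSyncCallbackInfo musicInfo)
        {
            // Log the marker's name and a timestamp for later logic checks.
            log.Add(new Entry { Cue = musicInfo.userCueName, Time = Time.time });
        }
    }

    // Name of the most recently heard cue, or null if none.
    public static string MostRecent() =>
        log.Count > 0 ? log[log.Count - 1].Cue : null;

    // True if the named cue appears anywhere in the log.
    public static bool HasPlayed(string cue) => log.Exists(e => e.Cue == cue);

    // Time the cue was last heard, or null if never heard.
    public static float? LastTime(string cue)
    {
        for (int i = log.Count - 1; i >= 0; i--)
            if (log[i].Cue == cue) return log[i].Time;
        return null;
    }
}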

A simple example would be:

If: PLAYER_DEATH
Then send Wwise event: BRASS_MUSIC

When the player respawns, BRASS_MUSIC will have been previously triggered by their death, and will be playing. The Music Monitor will log each individual segment that is part of BRASS_MUSIC. For this example, let’s suppose BRASS_MUSIC is split into ten segments, and the fifth segment, ‘SEGMENT_A5’, introduces a previously unheard vocal choir.

So continuing on, we could use the Gatekeeper to program the following:

If: PLAYER_DEFEATS_BOSS
Before: SEGMENT_A5
Then send Wwise event: BRASS_VICTORY
Else, run: Process 2

Process 2
If: PLAYER_DEFEATS_BOSS
After: SEGMENT_A5
Then send Wwise event: CHOIR_VICTORY
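Against the hypothetical Music Monitor sketched earlier, these two processes reduce to a single check of whether SEGMENT_A5 appears in the log. The following is an illustrative sketch only, not the Gatekeeper’s actual implementation:

// Illustrative sketch: the two victory processes as a log check.
using UnityEngine;

public static class VictoryLogic
{
    public static void OnPlayerDefeatsBoss(GameObject source)
    {
        if (!MusicMonitor.HasPlayed("SEGMENT_A5"))
        {
            // Process 1: boss defeated before the choir segment played.
            AkSoundEngine.PostEvent("BRASS_VICTORY", source);
        }
        else
        {
            // Process 2: boss defeated after the choir segment played.
            AkSoundEngine.PostEvent("CHOIR_VICTORY", source);
        }
    }
}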

Whether or not the vocal choir was introduced during gameplay determines whether the victory music continues to develop the previously heard brass music (because the player was never introduced to the choir) or develops the newly heard choir music (because they were). There can also be multiple conditions, including conditions on elapsed time, such as:

If: PLAYER_DEFEATS_BOSS
Before: SEGMENT_A5
Before: PLAYER_LOW_HEALTH
After: BOSS_DAMAGED_BY_SHOVEL (SFX cue)
^ - within time range of: 0 - 1 second
Then send Wwise event: PLAYER_IS_AWESOME

This process describes a situation in which the player defeated the boss before hearing the vocal choir, before ever reaching low health, and within one second of damaging the boss with a shovel. This implies that the player had plenty of health and used a shovel to defeat the boss, and did so before hearing the choir music. This single event sent to Wwise can begin to describe the player’s narrative: the player used a shovel, they never heard the choir music in SEGMENT_A5, and they had plenty of health. As a result we can trigger a cue called PLAYER_IS_AWESOME containing some sort of heroic shovel-slayer music, defining the player’s identity as one who easily kills bosses with shovels. When we trigger PLAYER_IS_AWESOME we can choose to include choir layers if the choir has been introduced, or, in this case, not.
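A sketch of how this multi-condition process might be evaluated is below, assuming (as the example implies) that game-status and SFX cues such as PLAYER_LOW_HEALTH and BOSS_DAMAGED_BY_SHOVEL are timestamped in the same log as music cues. All names are illustrative.

// Hypothetical sketch of the PLAYER_IS_AWESOME process.
using UnityEngine;

public static class BossVictoryLogic
{
    public static void OnPlayerDefeatsBoss(GameObject source)
    {
        float now = Time.time;
        float? shovelHit = MusicMonitor.LastTime("BOSS_DAMAGED_BY_SHOVEL");

        // "Before" conditions: these cues must not have been logged yet.
        bool beforeChoir     = !MusicMonitor.HasPlayed("SEGMENT_A5");
        bool beforeLowHealth = !MusicMonitor.HasPlayed("PLAYER_LOW_HEALTH");

        // "After ... within 0-1 second": the shovel hit was logged no more
        // than one second before the boss's defeat.
        bool shovelJustHit = shovelHit.HasValue && now - shovelHit.Value <= 1f;

        if (beforeChoir && beforeLowHealth && shovelJustHit)
        {
            AkSoundEngine.PostEvent("PLAYER_IS_AWESOME", source);
        }
    }
}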

An unexpected richness of using the Gatekeeper is that it is possible to trigger silent musical cues when events are received by Wwise. So even though the Music Monitor will log each heard segment of the PLAYER_IS_AWESOME music, we can also trigger a silent music cue with a marker called “PLAYER_IS_AWESOME_M” that signals that the logic process passed. We can then use that unheard Music Monitor log entry as a condition for future processes:

If: PLAYER_DEFEATS_BOSS_2
After: PLAYER_IS_AWESOME_M
Then send Wwise event: VICTORY_2_SHOVEL_SLAYER
Else, run: Process 2

Process 2

If: PLAYER_DEFEATS_BOSS_2
Before: PLAYER_IS_AWESOME_M
Then send Wwise event: VICTORY_2
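As a final sketch, chaining on the silent marker reduces to one more log check; as before, this is illustrative rather than the Gatekeeper’s actual implementation.

// Illustrative sketch: chaining on the silent "PLAYER_IS_AWESOME_M" cue.
using UnityEngine;

public static class Victory2Logic
{
    public static void OnPlayerDefeatsBoss2(GameObject source)
    {
        // The silent cue only appears in the log if the earlier
        // PLAYER_IS_AWESOME process passed.
        if (MusicMonitor.HasPlayed("PLAYER_IS_AWESOME_M"))
            AkSoundEngine.PostEvent("VICTORY_2_SHOVEL_SLAYER", source);
        else
            AkSoundEngine.PostEvent("VICTORY_2", source);
    }
}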

Knowing, via the Music Monitor, whether other processes previously passed or failed allows the composer to chain together multiple complex processes that describe the player’s journey. This makes it possible to compose, implement, and trigger musical changes dictated by far more specific sequences of player actions.

__________________________________________________



Chapter 5.2. Prototype and Closing Remarks

5.2.1. - Map Design
The prototype is designed to contain different gameplay scenarios that are often found in games. As seen in the map below, the prototype is split into three areas. In Area 1 the player can enter three different colored regions from the central zone. These regions can be entered and exited in any order. Within each region there is a key that must be collected before the player can progress to Area 2. Area 1 thus provides non-linear, exploration-driven gameplay with the objective of collecting the keys. When the player has collected all three keys, the area is complete and they may progress to Area 2.

Area 2 is more linear, requiring the player to progress into the black region, where they must acquire the fourth key and a gun. Both are essential for beating the game. The player may attempt to sneak past the enemy or fight them. Even though the exploration is more linear than in Area 1, the objective is more dynamic, requiring the player to interact with the enemy. After collecting the fourth key the player may continue to Area 3.

Area 3 contains two main events. The first is a boulder that chases the player down a hallway. The player must reach safety before the boulder hits them. If the player successfully dodges the boulder then it will break the final door, allowing the player to progress. The final part of the prototype is a boss fight. The player is unable to access the previous parts of the game after entering the boss fight room. The boss fight presents a type of gameplay where the player is locked into a difficult combat situation with no escape. The only objective is to win. After defeating the boss the game is over.



5.2.2. - Process

5.2.2.1. - Conflict with the Time-Indeterminacy of Player Progression
Gameplay can last as long or as short as the player dictates. The Dark Souls 3 and Tomb Raider analyses in Chapter 4 show DMSs that allow music to loop indefinitely to accommodate this. Halo’s Pillar of Autumn DMS tries to prevent the repetition of looping music by limiting the duration it can loop to around five minutes. I believe that music should respect both player action and musical progression, so that repetition is not used as a means of waiting on the player, and musical phrases are not interrupted by new material. This is in direct conflict with how people play games: if music is to last indefinitely, there must be an infinite amount of musical material that continuously develops, which is not possible within this paradigm. My solution is that music should be given the opportunity to conclude when the player is no longer progressing the narrative. Different themes and pieces can be continuously developed as long as the player keeps progressing the narrative before the music reaches a natural endpoint. If the player fails to progress the narrative, their stagnation is likely to be most congruent with a conclusive silence. When the player resumes the narrative, the previous musical material can continue developing as a new piece. This is how I have chosen to handle the problem of time-indeterminacy in gameplay.

5.2.2.2. - Various Approaches and Problems
My initial attempt at designing the DMS was to create a system that was as complex as possible. I wanted to compose and render unique music segments that would allow the music to transition seamlessly, always within a couple of beats. Drafting such a complex multi-structural composition was difficult. While attempting to create this DMS I found that there were far too many music segments to manage efficiently (over five hundred segments for Area 1 alone). I decided it was no longer a practical exercise and abandoned this approach. This initial attempt made it obvious to me that the ‘ideal’ granularity was not attainable. My conclusion was that the goal of high interactivity would need to be balanced against the gameplay’s highest-priority needs for immediate musical response.

My second attempt was to compose linear tracks for each game state that I could crossfade between (using markers instead of slices). However, when crossfading was used as a means of horizontal resequencing, I could hear the music fade, revealing the seams. My conclusion from this attempt was that crossfading between markers was too audible, even when done quickly. The DMS prototype would need to rely primarily on sliced segments for horizontal resequencing.

My third attempt sought a solution to a technical problem regarding how different vertically layered segments transition horizontally at the same time. For example, the music could have three individual layers: one for the piano part, one for the violin part, and one for the clarinet part. I attempted to render all the vertical layers together as a single file. However, when musical phrases overlapped bar lines it became difficult to find places where I could slice the track into segments. The piano part had short phrases that could be sliced every four beats, the clarinet part had sustained notes lasting seven beats, and the violin part had lots of rubato, with articulations happening just before the downbeat of each measure. It was possible to disregard the phrasing of the musical material and slice the track every four bars. However, if the music was horizontally resequenced to a segment that did not continue the resonant patterns in the audio tail of what had just played, the seam of the transition became apparent. My first solution to this problem was to compose the music so that no musical phrases overlapped each other, which made it possible to consistently render and organize all my segments as two-bar phrases. But this resulted in music that I felt was too square, predictable, and boring.

My eventual solution was to allow vertical music layers to be sliced at different points, offsetting their entry and exit points from one another while still allowing them to be horizontally resequenced simultaneously within the same metric framework. For example, the piano could transition to new material every four beats, the clarinet could transition to the new music three beats later (after seven beats), and the violin could have a slightly earlier exit point to accommodate its rubato phrasing. The result was that a layer containing short musical phrases could transition more granularly, while layers requiring more time to conclude a phrase could transition later. This created the illusion of having both granular interactivity and a linear musical structure.

My fourth and final attempt at creating the DMS gave the best results. By prioritizing moments that require more granular interactivity, I composed the musical material preceding moments of immediate change to be suitable for immediate interjections that progress the development of the music’s structure. The music uses a mix of horizontal and vertical techniques simultaneously. Vertical techniques add or subtract existing layers of music when an immediate change in musical material is required. Horizontal techniques allow the music to branch through different musical ideas while maintaining continuous structural development. Different layers were split and rendered with different entry and exit points to balance music that can transition quickly against music that requires more time to reach a conclusion before moving to a new idea. I found that a piece’s musical characteristics often define how granular it can be and how it should be organized. Slow, sustained string textures with long arching melodies generally have less clear-cut entry and exit points, but may be less obvious when crossfaded. Short, articulate woodwind passages containing small melodic fragments are very obvious when crossfaded, but can have much closer entry and exit points.

The conclusion of my final attempt was that, within the restriction of practicality, there must be a give and take between musical structure and interactivity. That balance must take the musical style, performance, orchestration, and gameplay into consideration. I found that the best way to strike this balance was through trial and error, and different game scenarios may require different approaches. My DMS design uses standardized methods and tools, and I believe it is the only DMS that explicitly attempts to address the conflict between structure and interactivity in video game music through these particular methods.

5.2.3. - DMSFRM, Prototype, and Closing Remarks
In the logic conditions shown on the DMSFRM I have referred to each grouping of segments that exist vertically with each other by the base layer only, without including the names of any extra layers. So, for example, when A1 transitions to C1, I do not mention all of the layers that occur vertically with C1. If a vertical layer is connected by a dotted line, it will always audibly play simultaneously with the segments it is connected to. If a vertical layer is longer than the segments it is connected to, and a transition occurs at a shorter segment’s exit point, then all layers in the next state will play while the longer segment finishes. If a transition has a fade-out time, as described in the logic processes on the lines, then the fade-out will silence any segments or layers from the previous state that may still be playing. Fade-out times are only mentioned where they exist. Vertical layers connected by solid lines indicate how layers are added or interchanged when certain conditions are met; these conditions are described on the connecting lines.

Structurally, Area 1 primarily uses horizontal resequencing techniques, quantizing segments to the beat, bar, phrase, or segment. Fading is used at times to bring different segments in and out when a change needs to be more immediate than the quantization allows. Area 2 primarily uses interchangeable layers and the horizontal resequencing of longer segments. In Area 2, states M, N, O, P, and R each contain the music of the preceding state: N has the music of M but adds more instruments; O has the music of N but adds more instruments; and so on. Area 3 uses a combination of additive layers and horizontal resequencing.

The construction of the DMS as a whole was made to be scalable. As previously mentioned, much of the music is split into different layers, but often, as indicated by the dotted vertical lines, all layers always play together. The music was organized this way so that I could potentially define additional conditions to control when the layers play independently of one another; Area 3 demonstrates this feature. The events and parameters that are tracked also make the DMS design scalable. Events for game actions, statuses, and player locations are always tracked, and can be seen in the Event List document. Not all tracked parameters were used in the DMS. The purpose of designing the DMS to be scalable was to ensure there was no ceiling limiting the complexity required to achieve my goals; the only restricting factor was the amount of time I could afford to invest in scope creep.

The DMSFRM shows all of the possible musical transitions and potential outputs. In the top-right corner of the prototype, the game tracks which state the player is currently in, allowing the player to explore the game and see which game states they have triggered. All of the individual segments are included in a folder below. The segments can also be viewed and tested within the Wwise session, which shows how everything is implemented.

The DMSFRM is an extensive resource to have alongside gameplay, together with the event list, audio files, and Wwise session. I believe that the prototype is a successful proof of concept showing how complex musical structures can adapt granularly to player input. In retrospect, I feel there was an opportunity to narrow the variety of music and focus more on changing how specific motivic material develops within a single piece. The prototype alters what music is heard, and how it transitions to and from other states, depending on the sequence of previous player actions. The logic processes of my DMS show that it would be possible to have a single piece of music in which the development of specific motifs was largely dependent on that sequence of actions. I feel that the balance between granular transitions and structural development is generally successful. However, the DMS contains some cases where music interrupts existing phrases, and the quality of the structural output is largely dependent on the player’s pace. Using a conditional logic system in the way that I have is by no means a perfect solution. However, I intend to move forward with this type of DMS design process. I believe that the next steps towards finding a better solution will require the development of new tools that make implementing music with complex conditional logic more efficient.

Download/view prototype DMSFRM

Download/view prototype event list

Download prototype audio files, decompress them, then open
The audio files must be decompressed using 7z. Alternatively, the folder can be found on the USB at Assets>Audio>Prototype Audio Files

Download Wwise project folder, decompress it, then open it in Wwise
The Wwise project folder must be decompressed by using 7z.

DMS Prototype - Windows
Download DynamicMusicPrototype.zip, unzip it, play it

DMS Prototype - Mac
Download DynamicMusicPrototype.zip, unzip it, play it

__________________________________________________

